Web-based possibilistic language models for automatic speech recognition
Abstract
This paper describes a new kind of language model based on possibility theory. The purpose of these models is to make better use of the data available on the Web for language modeling. In particular, they aim to integrate information about impossible word sequences. We address the two main problems of using this kind of model: how to estimate the measures for word sequences, and how to integrate the model into an automatic speech recognition (ASR) system. We propose a word-sequence possibilistic measure and a practical estimation method based on word-sequence statistics, which is particularly suited to estimation from Web data. We develop several strategies and formulations for using these models in a classical ASR engine, which relies on probabilistic modeling of the speech recognition process. This work is evaluated on two typical usage scenarios: broadcast news transcription with very large training sets, and transcription of medical videos, a specialized domain with only very limited training data. The results show that the possibilistic models provide a significantly lower word error rate on the specialized-domain task, where classical n-gram models fail due to the lack of training material. For broadcast news, the probabilistic models remain better than the possibilistic ones. However, a log-linear combination of the two kinds of models outperforms every model used individually, which indicates that possibilistic models bring information that is not captured by probabilistic ones.
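The log-linear combination mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual formulation, so everything below is an assumption made for illustration: the function names, the weights, the floor applied to the possibility measure, and the toy N-best scores are all hypothetical.

```python
import math


def log_linear_score(log_prob, possibility, weight_prob=1.0, weight_poss=1.0, floor=1e-6):
    """Combine a probabilistic LM log-score with a possibility measure in [0, 1].

    Assumption: the possibility is floored so that a sequence judged
    impossible is heavily penalized instead of producing log(0).
    """
    log_poss = math.log(max(possibility, floor))
    return weight_prob * log_prob + weight_poss * log_poss


def rescore(hypotheses, weight_prob=1.0, weight_poss=1.0):
    """Return the hypotheses sorted from best to worst combined score."""
    return sorted(
        hypotheses,
        key=lambda h: log_linear_score(
            h["log_prob"], h["possibility"], weight_prob, weight_poss
        ),
        reverse=True,
    )


# Toy N-best list (made-up scores): hypothesis "a" is slightly preferred by
# the probabilistic model but is rated nearly impossible.
nbest = [
    {"text": "a", "log_prob": -10.0, "possibility": 0.01},
    {"text": "b", "log_prob": -11.0, "possibility": 0.9},
]

best = rescore(nbest)[0]
```

With equal weights, the near-impossible hypothesis is demoted below the other one; setting `weight_poss=0.0` falls back to the probabilistic ranking alone. In practice the weights would be tuned on held-out data.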
Similar resources
Modèles de langage ad hoc pour la reconnaissance automatique de la parole. (Ad-hoc language models for automatic speech recognition)
The three pillars of an automatic speech recognition system are the lexicon, the language model and the acoustic model. The lexicon provides all the words that can be transcribed, associated with their pronunciation. The acoustic model provides an indication of how the phone units are pronounced, and the language model brings the knowledge of how words are linked. In modern automatic speech rec...
Combination of probabilistic and possibilistic language models
In a previous paper we proposed Web-based language models relying on the possibility theory. These models explicitly represent the possibility of word sequences. In this paper we propose to find the best way of combining this kind of model with classical probabilistic models, in the context of automatic speech recognition. We propose several combination approaches, depending on the nature of th...
Probabilistic and possibilistic language models based on the world wide web
Usually, language models are built either from a closed corpus, or by using World Wide Web retrieved documents, which are considered as a closed corpus themselves. In this paper we propose several other ways, more adapted to the nature of the Web, of using this resource for language modeling. We first start by improving an approach consisting in estimating n-gram probabilities from Web search e...
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighboring phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
Designing and implementing a system for automatic recognition of Persian letters by lip-reading using image processing methods
For many years, speech has been the most natural and efficient means of information exchange for human beings. With the advancement of technology and the prevalence of computer usage, researchers have turned their attention to the design and production of speech recognition systems. Among these, lip-reading techniques face many challenges for speech recognition; one of the challenges b...
Journal title:
- Computer Speech & Language
Volume 28
Publication date: 2014